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ABSTRACT 


The Chebychev (also Minimax and $L_\infty$ norm) criterion has been widely studied
as a method for curve fitting. Published computer codes are available to
obtain the optimal parameter estimates to fit a linear function to a set of
given points under the Chebychev criterion. The purpose of this paper is to
study procedures for obtaining the best subset of k parameters from a given
set of m parameters, where k is less than or equal to m.


KEY  WORDS 


Least  Absolute  Value 
Regression 
Linear  Programming 
Best  Subset 


"WIS  CSMI 
DtIC  *  * 

Urujnan-a-fii 

justifies'-1 


— - - - - 

jietrtbutlon/  _  . - - 

lv» liability  C«i*« _ 

‘  jive 11  aoA/or 

Bill  Special 

.  t  i 


Introduction 


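The classical linear curve fitting problem in the $L_p$ norm can be stated as
follows. Given $(y_i, x_{i1}, x_{i2}, \ldots, x_{im})$, $i = 1, 2, \ldots, n$,
determine $\beta$ to solve the problem

$$\text{Minimize} \quad \Bigl( \sum_{i=1}^{n} \Bigl| y_i - \sum_{j=1}^{m} x_{ij} \beta_j \Bigr|^{p} \Bigr)^{1/p}.$$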

When p=2, the problem is least squares; when p=1, the problem is least
absolute values; and as $p \to \infty$, the problem is to minimize the maximum
absolute deviation, which is also called the Chebychev curve fitting problem.
A comparison of these three criteria can be found in [1, 2]. Least squares
($L_2$ norm) is certainly the most popular approach in the statistical
community, although under certain conditions least absolute value and
Chebychev criteria are preferred. Least absolute values provides a maximum
likelihood estimate when the errors have a double-exponential distribution and
works well empirically when outliers are present in the data (see Gentle [10]
and Dielman and Pfaffenberger [7]). The Chebychev estimates are maximum
likelihood when errors are uniformly distributed and work well empirically
with most fat-tailed error distributions (see Appa and Smith [1] and
Rabinowitz [13]). This paper will present computational procedures to solve a
best subset problem under the Chebychev criterion.
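The relationship between the three criteria can be illustrated numerically. The following is a minimal sketch, assuming a made-up residual vector; the function names `lp_norm` and `chebychev_norm` are hypothetical and are not taken from any published code. It shows the $L_p$ value of a fixed set of residuals approaching the largest absolute residual as p grows.

```python
def lp_norm(residuals, p):
    """Return (sum_i |r_i|^p)^(1/p) for a finite p >= 1."""
    return sum(abs(r) ** p for r in residuals) ** (1.0 / p)


def chebychev_norm(residuals):
    """Return max_i |r_i|, the limiting case of lp_norm as p grows."""
    return max(abs(r) for r in residuals)


# Made-up residuals y_i - sum_j x_ij * beta_j for some fixed beta (illustration only).
residuals = [0.5, -2.0, 1.5, -0.25]

for p in (1, 2, 8, 64):
    print("p =", p, "value =", lp_norm(residuals, p))
print("Chebychev value =", chebychev_norm(residuals))
```

The p=1 value is the sum of absolute residuals, the p=2 value is the root of the sum of squares, and the values for larger p decrease toward the Chebychev value.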

The  problem  can  be  stated  as  follows. 

For $q = k, k+1, \ldots, m$, determine values for $\beta$ and $J$ which

$$\text{Minimize} \; \Bigl\{ \max_{1 \le i \le n} \Bigl| y_i - \sum_{j \in J} x_{ij} \beta_j \Bigr| \; : \; |J| = q \Bigr\} \qquad (1)$$

where $J \subseteq \{1, 2, \ldots, m\}$ and $|J|$ is the cardinality of the index
set $J$. In other words, consider all possible combinations of exactly q
parameters taken from the original m parameters, and choose the combination
yielding the smallest maximum absolute deviation.
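Problem (1) can be made concrete with a brute-force sketch that tries every index set of each size. This is not the algorithm of this paper: the subproblem solver below is a stand-in that evaluates every reference set of $|J|+1$ observations, and it assumes that the selected columns have full column rank and that $n > |J|$. The names `det`, `chebychev_value` and `best_subsets_brute`, the 0-based indices and the data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion (adequate for tiny matrices only)."""
    if not M:
        return Fraction(1)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))


def chebychev_value(X, y, J):
    """min over beta of max_i |y_i - sum_{j in J} x_ij beta_j|, by brute force.

    Each reference set R of |J|+1 observations gives a lower bound on the
    minimax error; the largest such bound is taken as the optimum.
    """
    q = len(J)
    best = Fraction(0)
    for R in combinations(range(len(y)), q + 1):
        rows = [[Fraction(X[i][j]) for j in J] for i in R]
        # Signed minors give a vector lam with lam^T rows = 0.
        lam = [(-1) ** t * det(rows[:t] + rows[t + 1:]) for t in range(q + 1)]
        norm = sum(abs(v) for v in lam)
        if norm:
            best = max(best, abs(sum(v * y[i] for v, i in zip(lam, R))) / norm)
    return best


def best_subsets_brute(X, y, k):
    """For each q = k..m return (smallest max deviation, index set J)."""
    m = len(X[0])
    return {q: min((chebychev_value(X, y, J), J)
                   for J in combinations(range(m), q))
            for q in range(k, m + 1)}


# Made-up data: columns 1, t, t^2 observed at t = 0..5.
X = [[1, t, t * t] for t in range(6)]
y = [1, 3, 2, 7, 12, 14]
for q, (z, J) in sorted(best_subsets_brute(X, y, 1).items()):
    print("q =", q, "J =", J, "max deviation =", z)
```

Exact rational arithmetic is used so that ties and zero residuals are detected without a tolerance. The cost grows combinatorially in both n and m, which is why a tree search with bounds is of interest.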

The  best  subset  problem  arises  frequently  in  statistical  analysis  where  it 
is desirable to recognize influential variables and study the effect of reducing
the number of variables. Draper and Smith [8] give an excellent overview
of  this  modeling  technique.  Solution  algorithms  exist  for  the  best  subset 
problem  when  the  least  absolute  value  [5,  12]  and  least  squares  [11]  criteria 
are  used;  however,  there  appear  to  be  no  algorithms  available  in  the  public 
domain  when  the  Chebychev  criterion  is  used. 

The  algorithm  for  the  best  subset  problem  in  the  Chebychev  norm  uses  a 
branch-and-bound  technique.  A  binary  tree  is  formed  and  each  node  of  the  tree 
corresponds  to  a  curve  fitting  problem  with  a  specified  set  of  parameters 
included in the model. Problems are solved using the algorithm of Armstrong
and Kung [3, 4]; however, because of available bounds, not all problems need
be solved to optimality. The framework of the enumeration is similar to that
given  by  Armstrong  and  Kung  [5]  for  the  best  subset  least  absolute  value 
problem.  This  framework  is  reviewed  in  the  next  section  and  placed  in  the 
Chebychev  curve  fitting  context. 

Algorithmic Framework

At  any  stage  of  the  enumeration  procedure  a  problem  of  the  form  given  by 
(1)  is  being  considered.  Rewriting  (1)  in  a  linear  programming  equivalent 
statement  yields: 

$$\text{Minimize} \quad z$$

subject to

$$-z \le y_i - \sum_{j \in J} x_{ij} \beta_j \le z, \quad i = 1, 2, \ldots, n \qquad (2)$$

where, at optimality, z will have the value of the maximum absolute residual.


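The linear programming dual of (2) is the following.

$$
\begin{aligned}
\text{Maximize} \quad & w = \sum_{i=1}^{n} y_i \pi'_i + \sum_{i=1}^{n} y_i \pi''_i \\
\text{subject to} \quad & \sum_{i=1}^{n} x_{ij} \pi'_i + \sum_{i=1}^{n} x_{ij} \pi''_i = 0, \quad j \in J \\
& \sum_{i=1}^{n} \pi'_i - \sum_{i=1}^{n} \pi''_i = 1 \\
& \pi'_i \ge 0, \quad \pi''_i \le 0, \quad i = 1, 2, \ldots, n
\end{aligned}
\qquad (3)
$$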
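As a reading aid, the passage from (2) to (3) can be sketched as follows. The split of (2) into two one-sided inequalities and the pairing of multipliers with them are a standard way of writing the dual and are given here as an assumption about the intended derivation, not as a quotation from the references.

```latex
\begin{aligned}
\text{(2) in one-sided form:}\quad
  & z + \sum_{j \in J} x_{ij}\beta_j \ge y_i
    && \text{(multiplier } \pi'_i \ge 0\text{)},\\
  & -z + \sum_{j \in J} x_{ij}\beta_j \le y_i
    && \text{(multiplier } \pi''_i \le 0\text{)},
    \qquad i = 1, \ldots, n.\\[4pt]
\text{Column of the free variable } \beta_j:\quad
  & \sum_{i=1}^{n} x_{ij}\pi'_i + \sum_{i=1}^{n} x_{ij}\pi''_i = 0,
    && j \in J.\\
\text{Column of the free variable } z \text{ (cost 1)}:\quad
  & \sum_{i=1}^{n} \pi'_i - \sum_{i=1}^{n} \pi''_i = 1.\\
\text{Right-hand sides give the objective:}\quad
  & w = \sum_{i=1}^{n} y_i\pi'_i + \sum_{i=1}^{n} y_i\pi''_i .
\end{aligned}
```

Each parameter in the model therefore corresponds to exactly one equality constraint of (3), which is why removing a parameter from the model amounts to removing a constraint from the dual.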


It  is  easily  shown  that  (2)  and  (3)  will  always  have  finite  optimal  objective 
values  which  are  equal.  Also,  the  simplex  algorithm  for  linear  programming 
problems will readily provide optimal $\pi$ values for (3) once (2) is solved and,
similarly, the optimal $\beta$ values are available once (3) is solved. Thus,
computational  considerations  alone  should  determine  whether  (2)  or  (3)  should 
be  solved  with  a  primal  simplex  algorithm. 
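The equality of the optimal objective values of (2) and (3) can be checked by hand on a tiny instance. The sketch below assumes a made-up problem with a single parameter (a constant fitted to three observations); the helper names and the particular primal and dual solutions are chosen for illustration only.

```python
def primal_objective(x, y, beta):
    """Smallest z feasible in (2) for a given beta: the maximum absolute residual."""
    return max(abs(yi - sum(xij * bj for xij, bj in zip(xi, beta)))
               for xi, yi in zip(x, y))


def dual_feasible(x, pi_plus, pi_minus, tol=1e-12):
    """Check the constraints of (3) for pi' = pi_plus and pi'' = pi_minus."""
    signs_ok = all(p >= 0 for p in pi_plus) and all(p <= 0 for p in pi_minus)
    balance_ok = all(
        abs(sum(xi[j] * (pp + pm) for xi, pp, pm in zip(x, pi_plus, pi_minus))) <= tol
        for j in range(len(x[0])))
    normal_ok = abs(sum(pi_plus) - sum(pi_minus) - 1.0) <= tol
    return signs_ok and balance_ok and normal_ok


def dual_objective(y, pi_plus, pi_minus):
    """Objective w of (3)."""
    return sum(yi * (pp + pm) for yi, pp, pm in zip(y, pi_plus, pi_minus))


# Fit a constant to y = (1, 3, 2): the single column of x is all ones.
x = [[1.0], [1.0], [1.0]]
y = [1.0, 3.0, 2.0]
beta = [2.0]                      # midrange of y, optimal for a constant fit
pi_plus = [0.0, 0.5, 0.0]         # weight on the largest observation
pi_minus = [-0.5, 0.0, 0.0]       # weight on the smallest observation

z = primal_objective(x, y, beta)
w = dual_objective(y, pi_plus, pi_minus)
print("z =", z, "w =", w, "dual feasible:", dual_feasible(x, pi_plus, pi_minus))
```

Any feasible solution of (3) gives a value w no larger than the z of any feasible solution of (2); here the two values coincide, so both solutions are optimal.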

Special purpose simplex algorithms have been developed for both (2) and
(3). Computational experience [13] has indicated that algorithms that maintain
a feasible solution to (3) are superior to those that maintain a feasible
solution to (2). The algorithm of Armstrong and Kung [3, 4] will be used to
solve (3). This uses a reduced basis, may pass through more than one extreme
point during an iteration and has a reduced ratio test. The objective value is
monotonically nondecreasing from iteration to iteration, and this
characteristic is particularly attractive when solving the best subset problem.

An outline of a step-by-step solution procedure for the best subset problem
will now be stated which is independent of the method used to solve (3).

STEPS OF ALGORITHM

STEP 1.  Set $q = m$; $z_i = \infty$, $i = k, \ldots, m$; $\ell = 0$;
         $J = \{1, 2, \ldots, m\}$; and $\mathrm{STAT}_j = 0$, $j = 1, 2, \ldots, m$.

STEP 2.  Solve (2) to obtain an optimal solution $(z, \beta)$, where $\beta_j = 0$
         for $j \notin J$. If $z \ge z_q$, then go to STEP 4; otherwise, go to
         STEP 3.

STEP 3.  A better solution has been found for a subset with q parameters
         included in the model. Set $z_q = z$ and save $\beta$.

STEP 4.  If $q \le k$ then go to STEP 6; otherwise, set $q \leftarrow q - 1$ and
         $\ell \leftarrow \ell + 1$.

STEP 5.  Find a parameter $\beta_u$ with $\mathrm{STAT}_u = 0$ and form a new
         subproblem with $\beta_u = 0$. Set $\mathrm{STAT}_u = -1$ and remove u
         from J. Go to STEP 2.

STEP 6.  If $\mathrm{PAR}_\ell > 0$ then go to STEP 7; otherwise, set
         $\mathrm{PAR}_\ell \leftarrow -\mathrm{PAR}_\ell$, $j = \mathrm{PAR}_\ell$,
         $\mathrm{STAT}_j = 1$ and $q \leftarrow q + 1$. Go to STEP 4.

STEP 7.  Set $j = \mathrm{PAR}_\ell$, $\mathrm{STAT}_j = 0$ and
         $\ell \leftarrow \ell - 1$. If $\ell > 0$ then go to STEP 6; otherwise,
         terminate the enumeration process.


VARIABLE DEFINITIONS

$q$        The number of parameters in the current subproblem.

$z_i$      The current best objective with i parameters.

$\mathrm{STAT}_j$    0   j-th parameter is in the model but has not been forced in
               (a free parameter)
           1   j-th parameter is forced in the model
          -1   j-th parameter is forced out of the model

$\mathrm{PAR}_\ell$    The parameter restricted at level $\ell$ of the predecessor
           path. If $\mathrm{PAR}_\ell$ is negative, the parameter is forced out
           of the model and if $\mathrm{PAR}_\ell$ is positive, the parameter is
           forced in the model.

$\ell$     The current level in the tree. The initial problem is at level zero
           and a node is one level deeper in the tree than the immediate
           predecessor.

One trivial modification to the algorithm is to force a parameter $\beta_r$ to be
included in every regression. This can be accomplished by setting
$\mathrm{STAT}_r$ equal to 1 rather than 0 at STEP 1.
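The enumeration in STEPS 1 through 7 can be sketched in a few lines. Several details below are assumptions rather than statements from the text: the subproblem solver is a brute-force stand-in (not the algorithm of [3, 4]) that assumes full column rank and $n > |J|$; $\mathrm{PAR}_\ell$ is recorded at STEP 5; index j is returned to J when it is forced in at STEP 6; a guard sends control to STEP 6 when no free parameter remains or when the level is zero; and indices are 0-based, so PAR stores the index plus one. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion (adequate for tiny matrices only)."""
    if not M:
        return Fraction(1)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** c * M[0][c] * det([row[:c] + row[c + 1:] for row in M[1:]])
               for c in range(len(M)))


def chebychev_value(X, y, J):
    """Stand-in solver for (2): largest reference-set bound over |J|+1 points."""
    q, best = len(J), Fraction(0)
    for R in combinations(range(len(y)), q + 1):
        rows = [[Fraction(X[i][j]) for j in J] for i in R]
        lam = [(-1) ** t * det(rows[:t] + rows[t + 1:]) for t in range(q + 1)]
        norm = sum(abs(v) for v in lam)
        if norm:
            best = max(best, abs(sum(v * y[i] for v, i in zip(lam, R))) / norm)
    return best


def best_subsets(X, y, k):
    m = len(X[0])
    q, level = m, 0                                    # STEP 1
    z_best = {i: None for i in range(k, m + 1)}        # None stands for infinity
    J_best, J, stat, par = {}, set(range(m)), [0] * m, {}
    step = 2
    while step:
        if step == 2:                                  # STEP 2 and STEP 3
            z = chebychev_value(X, y, sorted(J))
            if z_best[q] is None or z < z_best[q]:
                z_best[q], J_best[q] = z, tuple(sorted(J))
            step = 4
        elif step == 4:                                # STEP 4 and STEP 5
            free = [j for j in range(m) if stat[j] == 0]
            if q <= k or not free:                     # "not free" is an added guard
                step = 6
            else:
                u = free[0]                            # first free parameter
                q, level = q - 1, level + 1
                stat[u], par[level] = -1, -(u + 1)     # assumed PAR bookkeeping
                J.remove(u)
                step = 2
        elif step == 6:                                # STEP 6
            if level == 0:                             # added guard (e.g. k = m)
                step = 0
            elif par[level] > 0:
                step = 7
            else:
                par[level] = -par[level]
                j = par[level] - 1
                stat[j] = 1
                J.add(j)                               # assumed: j re-enters the model
                q += 1
                step = 4
        else:                                          # STEP 7
            stat[par[level] - 1] = 0
            level -= 1
            step = 6 if level > 0 else 0
    return z_best, J_best


X = [[1, t, t * t] for t in range(6)]                  # made-up data
y = [1, 3, 2, 7, 12, 14]
z_best, J_best = best_subsets(X, y, 1)
for q in sorted(z_best):
    print("q =", q, "J =", J_best[q], "z_q =", z_best[q])
```

In this sketch every subproblem is solved to optimality, so it corresponds to the basic procedure before any bound test is added. The branch that forces a parameter in is never solved, because it repeats the problem of its predecessor.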

The optimal solution of (1) is not used when $z \ge z_q$; thus it is not
necessary to solve (1) if it can be ascertained that $z \ge z_q$ by another
test. This additional test is easily implemented when (3) is solved using a
primal simplex algorithm. The algorithm will maintain a feasible solution to
(3) and the objective value (w) will be monotonically nondecreasing from
iteration to iteration. Therefore, the solution process can be terminated
whenever the following holds.

$$w \ge z_q \qquad (4)$$
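Test (4) can be sketched independently of the simplex details. In the sketch below, `lower_bounds` is a hypothetical iterable standing in for the objective values w of (3) produced by successive iterations, and the numbers are made up; only the stopping logic is illustrated.

```python
def solve_with_cutoff(lower_bounds, z_q=float("inf")):
    """Scan nondecreasing dual objective values w and stop once w >= z_q.

    Returns (last w seen, iterations used, solved_to_optimality).
    """
    w, iterations = None, 0
    for w in lower_bounds:
        iterations += 1
        if w >= z_q:
            # Condition (4): the subproblem cannot improve on the incumbent z_q.
            return w, iterations, False
    return w, iterations, True


# Made-up objective values from five iterations of a primal simplex method on (3).
history = [0.4, 0.9, 1.3, 1.7, 2.0]
print(solve_with_cutoff(history, z_q=1.2))   # stops early, at the third iteration
print(solve_with_cutoff(history, z_q=2.5))   # runs to optimality
```

The earlier a small $z_q$ is found for each subset size, the more often the first case occurs, which motivates guideline B.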

A  key  aspect  in  any  branch-and-bound  algorithm  is  the  sequence  in  which 
the subproblems are considered. The sequence should be based on the following
guidelines. 

A)  A  good  solution  for  a  subproblem  at  any  stage  should  be  easily 
obtained  to  facilitate  the  solution  of  the  subproblem. 

B)  A  good  solution  for  each  subset  size  should  be  obtained  as  soon  as 
possible  to  maximize  the  influence  of  condition  (4). 

The next section discusses how to implement the branch-and-bound algorithm
to  accommodate  guidelines  A  and  B. 

PENALTY  CALCULATIONS  AND  AN  ADVANCED  START 

The previously described branch-and-bound algorithm uses a last-in-first-out
(LIFO) branching rule. Viewing the algorithm in a tree format, every node
corresponds to a linear programming problem. Two problems are formed by
considering the "current" problem and removing a parameter from the model on
one branch and forcing the same parameter to be included in the model on the
other branch. Once a condition is specified, it must be satisfied in all
descendants of the node.

The branch where $\beta_r$ is forced in the model gives rise to the same linear
programming problem as in the immediate predecessor node. Thus, the problem at
node b of the diagram need not be solved. The problem of concern arises when
some $\beta_r$ is removed from the model. Let (3) represent the problem at node a.
Setting $\beta_r = 0$ in (2) is the same as removing the r-th constraint from (3).
The problem at node c written by modifying problem at node a is the following.

$$
\begin{aligned}
\text{Maximize} \quad & w = \sum_{i=1}^{n} y_i \pi'_i + \sum_{i=1}^{n} y_i \pi''_i \\
\text{subject to} \quad & \sum_{i=1}^{n} x_{ij} \pi'_i + \sum_{i=1}^{n} x_{ij} \pi''_i = 0, \quad j \in J, \; j \ne r \\
& \sum_{i=1}^{n} \pi'_i - \sum_{i=1}^{n} \pi''_i = 1 \\
& \sum_{i=1}^{n} x_{ir} \pi'_i + \sum_{i=1}^{n} x_{ir} \pi''_i + s_r = 0 \\
& \pi'_i \ge 0, \quad \pi''_i \le 0, \quad i = 1, 2, \ldots, n
\end{aligned}
\qquad (5)
$$


The logical linear programming variable $s_r$ is unrestricted in sign and,
hence, has the effect of eliminating the r-th constraint. Also, $s_r$ will
always appear in the basis of an optimal solution to (5). Given a basic
feasible solution to the problem at node a, a basic feasible solution to the
problem at node c can be formed by "conceptually" performing a simplex
iteration to bring $s_r$ into the basis. Once $s_r$ is brought into the basis
the r-th constraint is dropped from the problem and the structure is given by
(3). The index set J used at node c is formed by taking the index set J used at
node a and removing the index r.

The objective function change incurred by bringing $s_r$ into the basis
provides a penalty on the restriction $\beta_r = 0$. In other words, this is a
lower bound on the total objective change when going from node a to node c.
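In symbols, the lower-bound property can be summarized as below. The notation $w_a$, $w_c$ and $\Delta_r$ is introduced here for illustration and does not appear in the references; the fathoming condition in the last line is an inference obtained by combining the bound with test (4), with q taken as the number of parameters at node c.

```latex
\begin{aligned}
& w_a = \text{optimal objective of (3) at node } a, \qquad
  w_c = \text{optimal objective of (5) at node } c, \\
& \Delta_r = \text{objective change of the iteration that brings } s_r
  \text{ into the basis } (\Delta_r \ge 0), \\[4pt]
& w_c \;\ge\; w_a + \Delta_r \;\ge\; w_a , \\[4pt]
& w_a + \Delta_r \ge z_q \;\Longrightarrow\; w_c \ge z_q
  \quad \text{(node } c \text{ cannot improve the incumbent for its subset size)} .
\end{aligned}
```

The first inequality holds because the objective value is nondecreasing over the remaining iterations at node c.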

The penalty can be calculated using a reduced basis from the immediate
predecessor and a modification of the ratio procedures described in [4]. A
formal statement of the penalty procedures will not be given as it requires
excessive notation and a firm knowledge of the algorithm in [4]. For purposes
of the presentation here it is necessary to realize that the penalty
calculation provides the following two important pieces of information.

1.  The objective change during the iteration which brings $s_r$ into the
basis is determined.

2.  The dual variable ($\pi$) to leave the basis during this iteration is
determined. The feasibility of (3) is maintained after $s_r$ enters the
basis.

(In the computer implementation of the algorithm, $s_r$ is never created
explicitly; rather, the dimension of the basis is decreased by one.)

The  information  obtained  from  the  penalty  calculations  can  be  used  to 
develop  the  solution  tree.  This  topic  will  be  discussed  in  the  next  section. 

Implementation and Computational Testing

The algorithm for the best parameter subset using a Chebychev curve fitting
criterion was coded in FORTRAN and various implementations were tested.
The initial implementation had the following characteristics.

1.  The parameter chosen to restrict at a node was the free parameter
with the smallest index.

2.  Each subproblem was solved to optimality without utilizing any
information from the problems solved at preceding nodes.

The first variation made to the algorithm was to drop the requirement that
each subproblem be solved to optimality. If the objective value of the current
subproblem was not less than the best objective value found thus far for the
associated subset size, then the algorithm returned to the branching process.
The comparison was made immediately before updating the linear programming
basis. The solution time was cut by more than one half by this simple check.
Thus, all future testing included this feature.

The  second  variation  was  to  use  the  last  solution  from  the  immediate 
predecessor  as  a  starting  solution  for  the  current  subproblem.  The  procedure 
outlined  in  the  previous  section  was  implemented  to  determine  the  variable  to 
remove  from  the  basis  when  a  constraint  is  removed  from  (3).  This  required 
saving  the  indices  of  the  variables  in  the  final  basis  of  each  subproblem  so 
the  LU  decomposition  [6]  of  this  basis  could  be  reconstructed. 

Since the algorithm used a last-in-first-out branching rule, the
reconstruction of the basis was only necessary when backtracking took place. This
advanced  start  also  cut  solution  times  by  more  than  one  half  and  was  included 
in  all  future  versions. 

The final alterations to the algorithm involved the use of the penalties
to guide the construction of the solution tree. It was hypothesized that
the maximum benefit from comparing the objective value of the current
subproblem against the incumbent objective value would be derived by obtaining
the best solutions early in the enumeration process. Thus, assuming the
objective change during the first pivot reflected the overall objective change,
the free parameter with the smallest penalty should be chosen to be restricted.
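The two ways of choosing the parameter to restrict can be written as small selection functions. This is a sketch only: `stat` follows the STAT convention (0 free, 1 forced in, -1 forced out) with 0-based indices, and `penalty` is a hypothetical list of already computed penalties, one per parameter; how those penalties are obtained is not shown.

```python
def first_free(stat):
    """Return the free parameter with the smallest index, or None if none is free."""
    for j, status in enumerate(stat):
        if status == 0:
            return j
    return None


def smallest_penalty(stat, penalty):
    """Return the free parameter whose restriction has the smallest penalty."""
    free = [j for j, status in enumerate(stat) if status == 0]
    return min(free, key=lambda j: penalty[j]) if free else None


# Made-up node: parameter 0 forced in, parameter 2 forced out, 1 and 3 free.
stat = [1, 0, -1, 0]
penalty = [0.0, 0.7, 0.1, 0.2]    # hypothetical penalties for beta_j = 0
print(first_free(stat), smallest_penalty(stat, penalty))
```

The second rule needs one penalty calculation per free parameter at every node, whereas the first needs none.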

Table 1 gives a comparison of run times for solving a set of randomly
generated problems with the smallest penalty and first free parameter ("First
Penalty" in the tables) branching rule implementations. All problems were
randomly generated with the errors having a uniform distribution and solved on
the 170/750 Dual Cyber at the University of Texas at Austin. All times are
reported for solution only and do not include input-output. The iteration count
reported is the number of iterations within the algorithm of [3] for solving
the subproblems. All variables were single precision with the Cyber's 60-bit
word and the tolerance value for zero was set at 1.E-8. Choosing to restrict
the free parameter with smallest penalty was, overall, not as good a strategy
as choosing the first free parameter. The superiority of the first free
parameter rule became more pronounced as the problem dimensions increased. It
is felt that the poor performance of the smallest penalty rule came from the
extra work required to determine the parameter to restrict and from the purely
local information given by the penalty. A similar result has been observed
with penalties in integer programming [9].

The  next  phase  of  testing  considered  the  effect  of  limiting  the  smallest 
subset  size  and  not  requiring  the  verification  of  optimality.  Table  2  shows 
the  results  with  the  two  larger  values  of  m  and  a  solution  within  95,  98  or 
100 percent of optimality guaranteed. The use of the smallest penalty branching
rule did not seem to provide any better suboptimal solutions than the
first  free  parameter  option  for  this  problem  size.  However,  for  the  smaller 
dimension  problems  the  smallest  penalty  rule  frequently  provided  the  optimal 
solutions.  Table  3  shows  the  effect  of  limiting  the  number  of  parameters  in 
the  smallest  subset  (k)  to  5,  10  and  15  rather  than  1.  The  growth  of  solution 
times  is  approximately  exponential  with  the  decrease  in  the  value  of  k.  This 
is  to  be  expected  because  of  the  tree  search  strategy. 
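The size of the search can be gauged by counting candidate subsets. The sketch below assumes that a complete enumeration gives rise to one subproblem for every subset containing between k and m parameters; the function name `subproblem_count` is hypothetical, and the count says nothing about how much work each subproblem requires.

```python
from math import comb


def subproblem_count(m, k):
    """Number of index sets J of {1, ..., m} with k <= |J| <= m."""
    return sum(comb(m, q) for q in range(k, m + 1))


for m in (15, 20):
    for k in (15, 10, 5, 1):
        if k <= m:
            print("m =", m, "k =", k, "subsets =", subproblem_count(m, k))
```

For m = 20 the count rises from 21700 at k = 15 to 616666 at k = 10, which is consistent with the steep growth in solution times as k decreases.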

The final computational results displayed in tabular form compare the
solution times for the Chebychev best subset problem with times for the least
absolute value subset problem. The algorithm of [5], called L1LU, was used
for the least absolute value problems. The code L1LU consistently required
close to three times the CPU seconds of the first free parameter code for the
Chebychev norm problem. This time differential is similar to that observed for
solving a single curve fitting problem with the two norms.

Other branching strategies were tested without any notable results. The
rules attempted were the following.

1.  The parameter chosen to restrict was the one yielding the largest
penalty.

2.  During the first descent of the tree, the parameter chosen to
restrict yielded the smallest penalty; thereafter, the parameter
yielding the largest penalty was chosen.

3.  The pseudo-cost procedure of [9] was adapted to the problem at hand
and used to choose the parameter to restrict.

The  maximum  penalty  performed  poorly  and  the  other  two  strategies  were  at 
times  better  than  the  smallest  penalty;  however,  the  first  free  parameter 
branching  rule  remained  the  best. 

Conclusions 

This  paper  has  presented  an  algorithm  for  the  best  parameter  subset  using 
a  Chebychev  curve  fitting  criterion.  Computational  results  with  variations  of 
the  fundamental  branch-and-bound  procedure  indicate  that  the  use  of  penalties 
to  develop  the  tree  is  not  worth  the  additional  labor  for  most  problems. 
Solution  times  grow  exponentially  with  the  number  of  parameters  but  show  a 
slow  linear  growth  based  on  the  number  of  observations.  The  largest  problem 
solved  during  the  study  had  20  parameters,  300  observations  and  the  smallest 
subset size considered was 10. It seems, at this time, prohibitive to consider
the smallest subset size to be one and determine the best subset for
problems with m greater than or equal to 20.

One  modification  that  would  certainly  increase  the  speed  of  the  algorithm 
is  to  save  the  complete  LU  decomposition  at  each  node  rather  than  just  the 
indices of columns in the basis. For large problems, a significant amount of
time is spent reconstructing previously obtained LU decompositions. The
additional storage required to save previous LU decompositions would, however,
limit the size of problems that could be solved. In our implementations it
was felt that the savings of space was more important than the savings in
time.

Curve  fitting  with  a  Chebychev  criterion  is  often  a  desirable  alternative 
to  other  curve  fitting  criteria.  The  ability  to  analyze  the  best  parameter 
subsets  using  the  Chebychev  criterion  is  provided  by  the  algorithm  presented 
here.  Although  this  paper  has  only  dealt  with  algorithmic  procedures  for  the 
best  subset  problem,  the  foundation  for  simulation  and  empirical  studies  of 
curve  fitting  problems  is  made  available. 

The  computer  code  version  of  the  algorithm  presented  in  this  paper  is 
available from the authors.

TABLE 1

                  n = 200               n = 250               n = 300
            Smallest    First     Smallest    First     Smallest    First
            Penalty     Penalty   Penalty     Penalty   Penalty     Penalty

m = 5         .09        .12        .14        .13        .15        .16
             (21)       (27)       (24)       (26)       (23)       (24)

m = 10       2.42       2.23       3.18       2.78       4.12       3.44
            (154)      (186)      (240)      (205)      (281)      (251)

m = 15       104         79        151        136        156         99
           (2921)     (1504)     (5805)     (5515)     (4718)     (2124)

A computational comparison of two implementations of the best subset algorithm
for the Chebychev norm is given. The upper entry in each cell is the mean CPU
time in seconds and the lower entry is mean number of iterations. Three
problems were solved with each combination of m and n when m equals 5 or 10; a
single problem was solved when m equals 15.

TABLE 2

               m = 15, k = 1         m = 20, k = 10
            Smallest    First            First
            Penalty     Penalty          Penalty

 95%           84         74              2195
            (1072)      (808)            (1083)

 98%          121         85              2415
            (2793)     (1299)            (6400)

100%          156         99              2678
            (4718)     (2124)           (10244)

A computational comparison of two implementations of the best subset algorithm
for the Chebychev norm with three percentages of optimality guaranteed is
given. The upper entry in each cell is the CPU time in seconds and the lower
entry is the number of iterations. All problems had the value of n set at
300.

TABLE 3

                   m = 15                     m = 20
            Smallest      First        Smallest      First
            Penalty       Penalty      Penalty       Penalty

k = 5         130          90.3          DNR          DNR
            (5205)       (2285)

k = 10        36.1          25           DNR          2678
            (1391)        (660)                     (10244)

k = 15     [illegible]  [illegible]      201          174
             (35)          (35)          (96)        (452)

DNR = Did Not Run

A computational comparison of two implementations of the best subset algorithm
for the Chebychev norm with three minimum set sizes is given. The upper entry
in each cell is CPU time in seconds and the lower entry is number of
iterations. All problems had the value of n set at 300 and 100% of optimality
guaranteed.

TABLE 4

                       n = 200                         n = 300
            Smallest    First                Smallest    First
            Penalty     Penalty    L1LU      Penalty     Penalty    L1LU

m = 10       2.42       2.23       7.11       4.12       3.44      10.23
            (154)      (186)     (3410)      (218)      (251)     (3987)

m = 15       104         79        145        156         99        308
           (2921)     (1504)    (13515)     (4718)     (2124)    (69271)

An algorithm for the best subset least absolute value problem is compared
against an algorithm for the best subset Chebychev norm problem. The upper
entry in each cell is mean CPU time and the lower entry is mean number of
iterations. Three problems were solved when m equaled 10 and a single problem
when m equaled 15. All solutions had 100% of optimality guaranteed.
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